Add SGLang Connector for Prefill/Decode Disaggregation #64

bongwoo-bak · 2025-09-25T15:24:09Z

Summary

This PR introduces a new SGLang connector that supports prefill/decode (P/D) disaggregation for the LLM-D routing sidecar. It enables concurrent prefill and decode operations through SGLang’s bootstrap mechanism.

Changes

Added connector_sglang.go implementing P/D disaggregation
Integrated bootstrap configuration (host, port, room)
Updated cmd/llm-d-routing-sidecar/main.go and internal/proxy/proxy.go

Features

Room-based communication for coordinating prefill/decode
Configurable bootstrap port via the SGLANG_BOOTSTRAP_PORT environment variable (default 8668)
Prefill requests are sent asynchronously; decode requests are sent synchronously, and processing continues once the decode response is received

Test

Tested with SGLang prefill/decode services
Confirmed asynchronous prefill & synchronous decode execution
Successfully tested in Kubernetes cluster with AMD MI250 GPUs
Verified integration with Gateway and EPP

internal/proxy/connector_sglang.go

elevran

A couple of high-level comments:

We're moving the sidecar to the llm-d-inference-scheduler repo (targeting v0.4).
I suspect that supporting an additional inference server would affect other llm-d components (and perhaps require changes in IGW as well).

elevran · 2025-10-26T14:47:42Z

cmd/llm-d-routing-sidecar/main.go

-	vLLMPort := flag.String("vllm-port", "8001", "the port vLLM is listening on")
-	connector := flag.String("connector", "nixlv2", "the P/D connector being used. Either nixl, nixlv2 or lmcache")
+	vLLMPort := flag.String("vllm-port", "8001", "the port vLLM is listening on (also used for SGLang)")
+	connector := flag.String("connector", "nixlv2", "the P/D connector being used. Either nixl, nixlv2, lmcache, or sglang")


Connector represents the mechanism for transferring KV between P and D instances.
Does using sglang here represent the same concept, or is it more an implementation of how the sidecar should communicate with the inference server? Based on connector_sglang.go it seems to be similar. Can you please confirm?

elevran · 2025-10-26T14:49:33Z

cmd/llm-d-routing-sidecar/main.go

-	if *connector != proxy.ConnectorNIXLV1 && *connector != proxy.ConnectorNIXLV2 && *connector != proxy.ConnectorLMCache {
-		logger.Info("Error: --connector must either be 'nixl', 'nixlv2' or 'lmcache'")
+	if *connector != proxy.ConnectorNIXLV1 && *connector != proxy.ConnectorNIXLV2 && *connector != proxy.ConnectorLMCache && *connector != proxy.ConnectorSGLang {
+		logger.Info("Error: --connector must either be 'nixl', 'nixlv2', 'lmcache', or 'sglang'")


nit: perhaps it's worthwhile to list the options in an array (or map) and use that to generate L33, L56 and L57?
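
One possible shape for this suggestion is sketched below. It is only an illustration: the string literals stand in for the proxy.Connector* constants, and supportedConnectors, connectorHelp and validateConnector are hypothetical names. It uses the standard slices package, so it assumes Go 1.21 or later.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// supportedConnectors is the single source of truth for accepted
// --connector values. In the sidecar these would be the proxy.Connector*
// constants; plain strings are used here to keep the sketch standalone.
var supportedConnectors = []string{"nixl", "nixlv2", "lmcache", "sglang"}

// connectorHelp builds the flag description from the list.
func connectorHelp() string {
	return "the P/D connector being used. One of: " + strings.Join(supportedConnectors, ", ")
}

// validateConnector returns an error naming the accepted values when
// name is not in the list.
func validateConnector(name string) error {
	if slices.Contains(supportedConnectors, name) {
		return nil
	}
	return fmt.Errorf("--connector must be one of %s, got %q",
		strings.Join(supportedConnectors, ", "), name)
}
```

With this in place, the flag definition would pass connectorHelp() as its usage string and the check after flag.Parse() would call validateConnector(*connector), so adding a connector means touching one line.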

elevran · 2025-10-26T14:53:10Z

cmd/llm-d-routing-sidecar/main.go

 	port := flag.String("port", "8000", "the port the sidecar is listening on")
-	vLLMPort := flag.String("vllm-port", "8001", "the port vLLM is listening on")
-	connector := flag.String("connector", "nixlv2", "the P/D connector being used. Either nixl, nixlv2 or lmcache")
+	vLLMPort := flag.String("vllm-port", "8001", "the port vLLM is listening on (also used for SGLang)")


consider generalizing the variable name to inferencePort?

similarly, change the description to a more generalized form?

We may want to change the CLI flag to something more generic as well, but that would be a breaking change. We could accept a new flag (e.g., serving-port?), mark the current one as deprecated, and allow the two to coexist for a while (e.g., prefer the new flag in code, fall back to the current one if it is missing, and log a deprecation warning).
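
The coexistence idea could be sketched as follows. The flag name serving-port is the tentative example from the comment above, and resolveInferencePort is a hypothetical helper; neither exists in the sidecar today.

```go
package main

import (
	"flag"
	"io"
	"log"
)

// resolveInferencePort parses args and returns the port the inference
// server listens on. "serving-port" is a hypothetical new flag and
// "vllm-port" is the existing one, kept as a deprecated fallback.
func resolveInferencePort(args []string) (string, error) {
	fs := flag.NewFlagSet("sidecar", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch
	servingPort := fs.String("serving-port", "", "the port the inference server is listening on")
	vllmPort := fs.String("vllm-port", "8001", "deprecated: use --serving-port")
	if err := fs.Parse(args); err != nil {
		return "", err
	}

	// The new flag wins whenever it is set.
	if *servingPort != "" {
		return *servingPort, nil
	}

	// Visit only walks flags that were set explicitly, so the warning is
	// not logged when the old flag is merely left at its default.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "vllm-port" {
			log.Println("warning: --vllm-port is deprecated, use --serving-port")
		}
	})
	return *vllmPort, nil
}
```

Calling resolveInferencePort(os.Args[1:]) from main would keep existing deployments working while steering new ones to the generic flag.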

hhk7734 · 2025-10-28T10:55:38Z

@elevran Thanks for the review.
The concept of SGLangConnector is similar to NixlConnector; it coordinates PD to exchange KV-transfer information between the prefiller and decoder.

Since this repository is deprecated, I’ll close this PR for now and resubmit it to llm-d-inference-scheduler once our team has more bandwidth.

elevran · 2025-10-28T11:45:33Z

Thanks so much.
This LGTM, and I support its inclusion once you have the bandwidth to move it over to llm-d-inference-scheduler.

add sglang connector

bfd75c6

hhk7734 suggested changes Sep 25, 2025

internal/proxy/connector_sglang.go (outdated)

hhk7734 suggested changes Sep 25, 2025

internal/proxy/connector_sglang.go (outdated)

remove unnecessary logs and simplify code

1406879

elevran reviewed Oct 26, 2025

hhk7734 deleted the sglang branch October 28, 2025 10:56

moreh-dev closed this by deleting the head repository Oct 28, 2025

ezrasilvera mentioned this pull request Oct 29, 2025

[EPIC] Support sglang in the inference scheduler llm-d/llm-d-inference-scheduler#394

Open

3 tasks
